selection model
- North America > United States > Massachusetts (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- Asia > Indonesia > Bali (0.04)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- (2 more...)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- North America > United States > Massachusetts (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- Asia > Indonesia > Bali (0.04)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Africa > Tanzania > Arusha Region > Arusha (0.04)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
XiYan-SQL: A Novel Multi-Generator Framework For Text-to-SQL
Liu, Yifu, Zhu, Yin, Gao, Yingqi, Luo, Zhiling, Li, Xiaoxia, Shi, Xiaorong, Hong, Yuntao, Gao, Jinyang, Li, Yu, Ding, Bolin, Zhou, Jingren
To leverage the advantages of LLM in addressing challenges in the Text-to-SQL task, we present XiYan-SQL, an innovative framework effectively generating and utilizing multiple SQL candidates. It consists of three components: 1) a Schema Filter module filtering and obtaining multiple relevant schemas; 2) a multi-generator ensemble approach generating multiple highquality and diverse SQL queries; 3) a selection model with a candidate reorganization strategy implemented to obtain the optimal SQL query. Specifically, for the multi-generator ensemble, we employ a multi-task fine-tuning strategy to enhance the capabilities of SQL generation models for the intrinsic alignment between SQL and text, and construct multiple generation models with distinct generation styles by fine-tuning across different SQL formats. The experimental results and comprehensive analysis demonstrate the effectiveness and robustness of our framework. Overall, XiYan-SQL achieves a new SOTA performance of 75.63% on the notable BIRD benchmark, surpassing all previous methods. It also attains SOTA performance on the Spider test set with an accuracy of 89.65%.
- North America > United States (0.04)
- Asia > China > Zhejiang Province > Hangzhou (0.04)
Combining the Best of Both Worlds: A Method for Hybrid NMT and LLM Translation
Wu, Zhanglin, Wei, Daimeng, Chen, Xiaoyu, Shang, Hengchao, Guo, Jiaxin, Li, Zongyao, Luo, Yuanchang, Yang, Jinlong, Rao, Zhiqiang, Yang, Hao
Large language model (LLM) shows promising performances in a variety of downstream tasks, such as machine translation (MT). However, using LLMs for translation suffers from high computational costs and significant latency. Based on our evaluation, in most cases, translations using LLMs are comparable to that generated by neural machine translation (NMT) systems. Only in particular scenarios, LLM and NMT models show respective advantages. As a result, integrating NMT and LLM for translation and using LLM only when necessary seems to be a sound solution. A scheduling policy that optimizes translation result while ensuring fast speed and as little LLM usage as possible is thereby required. We compare several scheduling policies and propose a novel and straightforward decider that leverages source sentence features. We conduct extensive experiments on multilingual test sets and the result shows that we can achieve optimal translation performance with minimal LLM usage, demonstrating effectiveness of our decider.
A Semi-supervised CART Model for Covariate Shift
Cai, Mingyang, Klausch, Thomas, van de Wiel, Mark A.
Machine learning models used in medical applications often face challenges due to the covariate shift, which occurs when there are discrepancies between the distributions of training and target data. This can lead to decreased predictive accuracy, especially with unknown outcomes in the target data. This paper introduces a semi-supervised classification and regression tree (CART) that uses importance weighting to address these distribution discrepancies. Our method improves the predictive performance of the CART model by assigning greater weights to training samples that more accurately represent the target distribution, especially in cases of covariate shift without target outcomes. In addition to CART, we extend this weighted approach to generalized linear model trees and tree ensembles, creating a versatile framework for managing the covariate shift in complex datasets. Through simulation studies and applications to real-world medical data, we demonstrate significant improvements in predictive accuracy. These findings suggest that our weighted approach can enhance reliability in medical applications and other fields where the covariate shift poses challenges to model performance across various data distributions.
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Asia > Middle East > Jordan (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
- Health & Medicine > Therapeutic Area > Oncology (0.93)
- Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.93)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
Neural Solver Selection for Combinatorial Optimization
Gao, Chengrui, Shang, Haopu, Xue, Ke, Qian, Chao
Machine learning has increasingly been employed to solve NP-hard combinatorial optimization problems, resulting in the emergence of neural solvers that demonstrate remarkable performance, even with minimal domain-specific knowledge. To date, the community has created numerous open-source neural solvers with distinct motivations and inductive biases. While considerable efforts are devoted to designing powerful single solvers, our findings reveal that existing solvers typically demonstrate complementary performance across different problem instances. This suggests that significant improvements could be achieved through effective coordination of neural solvers at the instance level. In this work, we propose the first general framework to coordinate the neural solvers, which involves feature extraction, selection model, and selection strategy, aiming to allocate each instance to the most suitable solvers. To instantiate, we collect several typical neural solvers with state-of-the-art performance as alternatives, and explore various methods for each component of the framework. We evaluated our framework on two extensively studied combinatorial optimization problems, Traveling Salesman Problem (TSP) and Capacitated Vehicle Routing Problem (CVRP). Experimental results show that the proposed framework can effectively distribute instances and the resulting composite solver can achieve significantly better performance (e.g., reduce the optimality gap by 0.88\% on TSPLIB and 0.71\% on CVRPLIB) than the best individual neural solver with little extra time cost.
- Europe > Austria > Vienna (0.14)
- North America > United States > California > Los Angeles County > Long Beach (0.14)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.05)
- (12 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.86)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.82)
Crafting Narrative Closures: Zero-Shot Learning with SSM Mamba for Short Story Ending Generation
Sharma, Divyam, Santhanam, Divya
Writing stories is an engaging yet challenging endeavor. Often, authors encounter moments of creative block, where the path forward in their narrative becomes obscured. This paper is designed to address such moments by providing an innovative solution: A tool that completes stories based on given prompts. By inputting a short story prompt, users can receive a conclusion to their story, articulated in one sentence or more, thereby enhancing the storytelling process with AI-driven creativity. This tool aims not only to assist authors in navigating writer's block but also to offer a fun and interactive way for anyone to expand on story ideas spontaneously. Through this paper, we explore the intersection of artificial intelligence and creative writing, pushing the boundaries of how stories can be crafted and concluded. To create our final text-generation models, we used a pre-trained GPT-3.5 model and a newly created finetuned SSM-Mamba model, both of which perform well on a comprehensive list of metrics including BERT score, METEOR, BLEU, ROUGE, and Perplexity. The SSM model has also been made public for the NLP community on HuggingFace models as an open source contribution, which for the timebeing is a first of its kind state-space model for story-generation task on HuggingFace.
CHASE-SQL: Multi-Path Reasoning and Preference Optimized Candidate Selection in Text-to-SQL
Pourreza, Mohammadreza, Li, Hailong, Sun, Ruoxi, Chung, Yeounoh, Talaei, Shayan, Kakkar, Gaurav Tarlok, Gan, Yu, Saberi, Amin, Ozcan, Fatma, Arik, Sercan O.
In tackling the challenges of large language model (LLM) performance for Text-to-SQL tasks, we introduce CHASE-SQL, a new framework that employs innovative strategies, using test-time compute in multi-agent modeling to improve candidate generation and selection. CHASE-SQL leverages LLMs' intrinsic knowledge to generate diverse and high-quality SQL candidates using different LLM generators with: (1) a divide-and-conquer method that decomposes complex queries into manageable sub-queries in a single LLM call; (2) chain-of-thought reasoning based on query execution plans, reflecting the steps a database engine takes during execution; and (3) a unique instance-aware synthetic example generation technique, which offers specific few-shot demonstrations tailored to test questions.To identify the best candidate, a selection agent is employed to rank the candidates through pairwise comparisons with a fine-tuned binary-candidates selection LLM. This selection approach has been demonstrated to be more robust over alternatives. The proposed generators-selector framework not only enhances the quality and diversity of SQL queries but also outperforms previous methods. Overall, our proposed CHASE-SQL achieves the state-of-the-art execution accuracy of 73.0% and 73.01% on the test set and development set of the notable BIRD Text-to-SQL dataset benchmark, rendering CHASE-SQL the top submission of the leaderboard (at the time of paper submission).
- North America > Aruba (0.05)
- Europe > Czechia (0.04)
- North America > United States > California > Santa Clara County > Sunnyvale (0.04)
- (4 more...)
- Workflow (1.00)
- Research Report (1.00)
- Education (1.00)
- Health & Medicine (0.68)
- Leisure & Entertainment > Sports (0.46)